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ABSTRACT 

We present and analyze the positions, distances, and radial velocities for over 
4000 blue horizontal-branch (BHB) stars in the Milky Way's halo, drawn from 
SDSS DR8. We search for position-velocity substructure in these data, a signa- 
ture of the hierarchical assembly of the stellar halo. Using a cumulative "close 
pair distribution" (CPD) as a statistic in the 4-dimensional space of sky position, 
distance, and velocity, we quantify the presence of position-velocity substructure 
at high statistical significance among the BHB stars: pairs of BHB stars that 
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are close in position on the sky tend to have more similar distances and radial 
velocities compared to a random sampling of these overall distributions. We 
make analogous mock-observations of 11 numerical halo formation simulations, 
in which the stellar halo is entirely composed of disrupted satellite debris, and 
find a level of substructure comparable to that seen in the actually observed BHB 
star sample. This result quantitatively confirms the hierarchical build-up of the 
stellar halo through a signature in phase (position-velocity) space. In detail, the 
structure present in the BHB stars is somewhat less prominent than that seen in 
most simulated halos, quite possibly because BHB stars represent an older sub- 
population. BHB stars located beyond 20 kpc from the Galactic center exhibit 
stronger substructure than at r gc < 20 kpc. 

Subject headings: galaxies: individual (Milky Way) — Galaxy: halo — Galaxy: 
structure — stars: horizontal-branch — stars: kinematics and dynamics 



1. Introduction 



The current hierarchical structure formation paradigm implies that the formation of our 



Milky Way entailed a sequence of dark matter driven accretion and merger events (Searle 



& Zinn 1978 White & Rees 1978; Blumenthal et al. 1984). This naturally results in the 



expectation that the stellar halo should be largely built up from stars of tidally disrupted 
satellite galaxies, resulting in substructure that may appear as stellar streams with different 



Bullock et al. 


2001 


Cooper et al. 


2010b 


Bullock & Johnston 



2005, hereafter BJ05). Because stars are gravitationally collisionless systems, their phase- 
space (spatial and velocity) distributions encode and retain aspects of their origin. This 
implies that an analysis of substructure in the position-velocity distribution of stars in the 
halo is a direct test for hierarchical models of galaxy formation. 

In the past decades, observational evidence of spatial substructure has indeed been 



found in the Milky Way, both near the Sun (Majewski et al. 1996 Helmi et al. 1999) and 



at larger distances (Ibata et al. 1994, 1995). The most prominent example is the discovery 



of the Sagittarius dwarf galaxy (Ibata et al. 1994, 1995; Yanny et al. 2000) and its trails of 



debris (Ibata et al. 2001 Majewski et al. 2003) 



In nearby samples of stars, where the full 6D phase-space coordinates can be measured, 
substructure in the stellar distribution is seen in velocity space, or even in the integrals of 



motion (Dehnen & Binney 1998 Helmi et al. 1999 Klement et al. 2008 2009 Morrison 



et al. 


2009 


Smith et al. 


2009) 
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~ 20 kpc, individual transverse velocities are all but impossible to measure from proper 
motions with current technology. The available observables are therefore the position in the 
sky, a distance estimate from photometric or spectroscopic luminosity determinations, and 
line-of-sight velocity: a,8,d, and Vi os . When averaged over large angular areas and broad 
distance ranges, the line-of-sight kinematics of the Milky Way halo stars at 10-50 kpc are 
well-described by a simple Gaussian with o"i os ~ 111 km s _1 (Xue et al. 2008, hereafter 
X08). However, because the stellar halo is collisionless, preserving phase-space density, 
substructure in position space necessarily implies substructure in velocity space. Recent 



work by Starkenburg et al. (2009) and De Propris et al. (2010) indicate that the Milky 



Way's stellar halo indeed possesses detectable position-velocity substructure. Schlaufman 



et al. (2009) have shown that metal-poor halo stars within ~ 17.5 kpc from the Sun exhibit 



clear evidence for velocity clustering on very small spatial scales (which the authors refer to 
as Elements of Cold HalO Substructure, or ECHOS). 

With the development of large-scale sky surveys, such as the Two Micron All Sky 



Survey (2MASS; Skrutskie et al. 2006), the Sloan Digital Sky Survey (SDSS; York et al. 
Stoughton et~aL]|2002| |Abazajian eTal] [20031 |20~04l [20051 [20091 |Adelman-McCarthy 



2000 



et al. 2006, 2007, 2008), and the follow-up SEGUE survey (Yanny et al. 2009b), we have an 



unprecedented opportunity to examine Milky Way halo streams in detail (Ivezic et al. 2000 



Yanny et alT] [20001 |Newberg et""aTl|2002l |Majewski et~aLl|2003l | Yanny et all [20031 |Newberg 



et al.||2007 Yanny et al.||2009a ). Halo star samples are now of sufficient size and quality that 
a direct statistical comparison with models, such as BJ05, has become possible. 

A first quantitative comparison indicated that the observed level of spatial substructure 
(on all scales) is similar to that expected from those simulations, where the halo is composed 
entirely of disrupted satellites (Bell et al. 2008, hereafter B08). Imaging surveys of M31 



( Ibata et al.||2007 ) have revealed a similarly rich set of substructure in the stellar halo of that 
galaxy. Based on photometry of main sequence turn-off (MSTO) stars, B08 constructed 
a coarse 3D map of the stellar halo density, with almost a factor of two uncertainty in 
distances. BHB stars are a much rarer tracer of the old metal-poor population, but have 



the great advantage of being luminous, with M, 



+ 0.7 (vs. M 6 



3.5 for MSTO 



stars) and of having precise distance estimates (~ 5%; X08). BHB stars have also been a 
special spectroscopic target class in SDSS and SEGUE (e.g., Yanny et al. 2009b). Hence, 



the sample of possible BHB stars with spectra from SDSS constitutes by far the largest set of 
luminous tracers (extending to distances of ~ 80 kpc) of the Milky Way's stellar halo with 
available four dimensional (a, 5, d, Vi os ) information, where the distances are accurate to 5% 
and the radial velocities accurate to 5 ~ 20 km s -1 . This sample enables the first attempt at 
checking that the statistical properties of kinematics matches (or not) model expectations. 
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This paper describes a large sample of probable BHB stars with measured kinematics, 
and presents an exploration of how to quantify position-velocity substructure in the Milky 
Way's stellar halo in order to compare the observation to simulations such as from BJ05. 
It is certainly possible to pick out the kinematic signature of the Sagittarius stream (e.g., 



Ibata et al. 2001; Starkenburg et al. 2009) in these data. However, what we aim for here 



is to devise a simple objective measure for quantifying such substructure. Specifically, we 
employ the close pair distribution (CPD) statistic, F = Wg9 2 + WAd(Ad) 2 + u>av ;os (AV; os ) 2 , 



to detect substructure, following Starkenburg et al. (2009). Here, 9, Ad, AVi os are the 



angular, distance, and velocity separation of pairs of stars, and Wq, WAd, and WAv ios are 
suitable weights. The idea is that a structured or "clumpy" position-velocity distribution 
will have more pairs with small F than a suitably chosen random distribution. As also 
argued by B08, it is important for quantitative data-model comparisons to have a general 
statistical measure of substructure, rather than specifically searching for (here, kinematical) 
substructure associated with a particular feature, such as the Sagittarius stream, so we also 
explore what we should expect from the BJ05 models, and compare with the observations. 

This paper is organized as follows. In Section 2 we present the sample of BHB stars. 
Section 3 provides the definition of the close pair distribution (CPD) as a statistic, and 
describes its application to the sample of BHB stars. The analogous CPD for the BJ05 
simulations and their statistical analysis is presented in Section 4. Conclusions from the 
comparisons between observations and simulations are presented in Section 5. 



2. The Spectroscopic Sample of BHB stars from SDSS DR8 



SDSS-I was an imaging and spectroscopic survey that began routine operations in April 
2000, and continued through June 2005. The SDSS and its extensions are using a dedicated 



2.5m telescope (Gunn et al. 2006) located at the Apache Point Observatory in New Mexico. 
The Sloan Extension for Galactic Understanding and Exploration (SEGUE) is one of the 
three key projects (the legacy survey, the supernova survey, and SEGUE) in the recently 
completed first extension of the Sloan Digital Sky Survey, known collectively as SDSS-II. The 
SEGUE program, which ran from July 2005 to July 2008, obtained ugriz imaging of some 



3500 deg 2 of sky outside of the SDSS-I footprint (Fukugita et al.|1996 Gunn et al.|1998 


2006 


York et al. 2000 Hogg et al. 


2001 Smith et al. 2002 


Stoughton et al. 2002| Abazajian et al. 


2003 


2004 2005, 2009 Pier et al. |2003; Ivezic et al. 


2004 Adelman-McCarthy et al. 


2006, 


2007 


2008 Tucker et al.[|200( 


3), with special attention being given to scans of lower Galactic 



latitudes (\b\ < 35°) in order to better probe the disk/halo interface of the Milky Way. 
SEGUE obtained some 240,000 medium-resolution spectra of stars in the Galaxy, selected 
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to explore the nature of stellar populations from 0.5 kpc to 100 kpc (Yanny et al. 2009b). 



SDSS-III, which is presently underway, has already completed the sub-survey SEGUE-2, 
an extension intended to obtain an additional sample of over 120,000 spectra for distant 
stars that are likely to be members of the outer-halo population of the Galaxy. Data from 



SEGUE-2 has been distribution as part of the eighth public data release, DR8 ( |Aihara et al. 



2011). 



The SEGUE Stellar Parameter Pipeline processes the wavelength- and flux-calibrated 



spectra generated by the standard SDSS spectroscopic reduction pipeline (Stoughton et al. 



2002), obtains equivalent widths and/or line indices for more than 80 atomic or molecular 



absorption lines, and estimates T eff , log g, and [Fe/H] through the application of a number 
of approaches (see Lee et al. 2008a[b Allende Prieto et al. 2008; Smolinski et al. 2011). 



We construct a sample of BHB stars from SDSS DR8 with spectra in a fashion very 
similar to X08. The spectra are used both to classify stars as BHB and to obtain measured 
radial velocities. In essence, we combine an initial color cut for BHB candidates with Balmer- 
line profile shape measurements. The S/N of the spectra affects the precision of Balmer- 
line profile shape measurements. Therefore, spectra are only accepted when the fractional 
variance between the best-fitting profile and observed Balmer-line profile is ^ 0.1. The color 
cuts that we used in this paper are: 



0.8 < u-g < 1.5 
-0.5 < g - r < 0.0 
The Balmer-line profile cuts used are: 

for the B.5 line : D . 2 < 29 A, f m ^ 0.35 

for the H 7 line : 0.75 < c 7 < 1.25, 7 A < b 7 < 10.8 - 26.5 (c. 



1.08) 



where D0.2, fm, c, and b are the width of the Balmer line at 20% below the local continuum, 
the flux relative to the continuum at the line c ore, and the para meters of the Sersic profile, 



y = 1.0 — aexp 



|A-A()| 
b 



Sirko et al. 



2004 



respectively (see 
The following are the Balmer-line profile cuts used in X08: 
for the R5 line : 17 A ^ D . 2 < 28.5 A, 



X08). 



0.1 < f m < 0.3 



for the H7 line : 0.75 < c 7 < 1.25, 



7 A < b 7 < 10.8- 26.5 (c 7 



1.08) 



We retain the color cut (Yanny et al. 2000[ ), but slightly relax the Balmer-line profile 
cuts compared to X08, as illustrated in Figure [TJ Since the two Balmer-line profile cuts 
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are independent, the relaxed criteria on the H<5 line should introduce little additional con- 
tamination, but overall it makes the criteria less stringent. As compared with our previous 
criteria (X08), the relaxed criteria will have minimal impact on the following substructure 
analysis. For further details on the sample selection, we refer the interested reader to X08 
and references therein. 

By selecting stars that satisfy the color cuts and both Balmer-line profile cuts, we obtain 
a sample of 4985 stars from SDSS DR8 with high BHB probability (see Table 1 for example), 
of which there are 4625 halo BHB stars with |Z| > 4 kpc and r gc < 60 kpc and 26 halo BHB 
stars between 60 and 80 kpc. Figure [2] shows the sky coverage and spatial distribution of 
these 4625 halo BHB stars. Distances were derived from the magnitudes and colors as in 
X08. The line-of-sight velocities, V/ os , are converted from Local Standard of Rest frame 
to the Galactic Standard of Rest frame by adopting a value of 220 km s _1 for the Local 
Standard of Rest (Vi sr ) and a solar motion of (+10.0, + 5.2, + 7.2) km s _1 , as in X08. 



Small changes that may arise from adopting a different V C i r (Ro) pair, e.g., Bovy et al. (2009) 
or Koposov et al. (2010), do not matter for the subsequent analysis. 

This sample of halo BHB stars has radial velocity errors of 5-20 km s _1 and much more 
accurate distances than other distant halo stars with available kinematic information. For 
instance, distances are ~ 4x more accurate than in the sample of halo giants recently used 



by Starkenburg et al. (2009) in a search for distant halo substructure, and our sample is 



50-fold larger. Schlaufman et al. (2009) discussed a sample of ~ 10,000 metal-poor main 
sequence turnoff (MPMSTO) stars with distances greater than 10 kpc from the Galactic 
center, with vertical distance |Z| more than 4 kpc, and with distances less than 17.5 kpc 
from the Sun, to identify ECHOS in the inner halo. By comparison, our sample extends to 
four times larger distance. 



The cumulative distribution of the BHB stars with r gc shown in Figure [3] indicates that 
about 95% of the BHB stars have r gc < 40 kpc, so that any estimate of substructure should 
be dominated by the BHB stars within this distance. For a cleaner selection function, we 
use only the 4243 BHB stars with |Z| > 4 kpc and r gc < 40 kpc in the following analysis, 
which is still sufficiently distant to enable tests for substructure well into the outer halo. 



3. The Close Pair Distribution of BHB stars in DR8 

We now turn to quantifying the presence of any kinematic substructure. There is no 
unique choice of a substructure statistic, nor is there a rigorous way to derive one without 
making very specific assumptions about the nature of the underlying distributions. For 
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kinematically cold streams that are not strongly phase-mixed, a "pairwise velocity difference" 
(PVD), (\AVi os \)(Ar^ c ), could conceivably be used to detect velocity substructure. It is 
expected that (|AVj os |) should be lower for small separations Ar~^ c in stellar streams, where 
adjacent stars have similar velocities. However, as many streams in simulated halos are 
phase-wrapped, the PVD proved not to be very suitable to quantify substructure, even in 



simulated halos where all stars arise from disrupted satellites (see Xue et al. 2009). 

For distant large scale features, a great circle method ( Lynden-Bell fc Lynden-Bell||1995 



Palma et al. 2002) may be appropriate. We have chosen not to use such a method in this 



paper for two reasons. Most importantly, BJ05 and Johnston et al. (2008) demonstrate that 



many halo structures (in particular older structures or those on more radial orbits) do not 
have a great circle geometry, and we wish to explore the degree of halo structure in a way 
that is sensitive in principle to a broader range of geometries. Secondly, the spectroscopic 
coverage is sparse and covers only a fraction of the sky, and we lack useful proper motion 
information for our sample, making a great circle analysis more challenging to implement. 



As an alternative to the PVD and a great circle analysis, we follow Starkenburg et al. 



(2009) and De Propris et al. (2010) in exploring a statistic that focuses on the incidence 



of close pairs in (a, 8, d : Vi os ) space (a similar approach was developed by Doinidis h Beers 



1989). Specifically, we define the separation between two stars i and j as: 



F 



weOij + w Ad (di - dj) 2 + w AVlos (Vios,i - V t 



los,j i 



where 



cos Qij = cos bi cos bj cos(/j — L) + sin bi sin bj, 



2 ) 



((Ad) 2 ) 



■ w AV los 



((A^ os ) 2 ) 



and where (...) refers to the average over all pairs. 

While the angular separation combines the galactic longitude and latitude in the dis- 
tance measure F, the heliocentric distance and line-of-sight velocity are used as independent 
variables... . In the definition of F, the weights wg; w A a] w AVi os , are solely used to create a con- 
sistent metric for the angular, distance and velocity dimensions of F through normalizing each 
dimension by the ensemble average of separation, ((6 2 ) = 6928 squaredegree) , distance dif- 
ference ( {(Ad) 2 } = 127 kpc 2 ), line-of-sight velocity difference (((AVi os ) 2 } = 22455 kmV 2 ). 



Starkenburg et al. (2009) have pointed out that this algorithm is quite insensitive to small 



changes in the weighting factors. We use somewhat different weighting factors from them, but 
still detect obvious substructure signal in the BHB sample presented here. In parallel work 



to this paper Cooper et al. (2010a) applied the algorithm and weighting factors as Starken 



burg et al. (2009) to 2400 BHB stars and found obvious substructure signal. Therefore, the 
choice of weighting factors seems not to be a critical aspect of sub-structure quantification. 



If position- velocity substructure is present, we expect that the distribution of Fij for the 
observed sample has more close pairs than the null hypothesis (defined below) of a smooth 
halo where positions and velocities are uncorrelated: N obs (< F) > N(< F ). This is most 
conveniently captured in the cumulative distribution of the F^, N(< F), as illustrated in 
Figure [4} 

This null hypothesis assumes that the halo can be described by some spatial density 
distribution, Pbhb(?), an d a velocity distribution where o~i os does not depend on the partic- 
ular position. Indeed, averaged over all angles, o~i os ~ 111 km s -1 is observed to be nearly 
constant as a function of radius (X08). In its angular distribution and its distance distribu- 
tion, the sample selection function of our BHB sample is very complex (see, e.g., Figure |2]for 
the angular distribution). However, stellar radial velocities are uncorrelated to the sample 
selection, and it is reasonable to assume that the distance selection of the stars in the same 
part of the sky are independent realizations of the overall distance (or, apparent magnitude) 
distribution. Consequently, we cannot randomize 9 when constructing the null hypotheses. 
As our null hypothesis, we can only independently draw random Ad and AVi os . Specifically, 
we do this by scrambling only the distances and velocities within the sample to create the 
null hypothesis, but leave the angular position unchanged: 

F ,ij = wqB\ + w d {d ir - d jr ) 2 + w V[os (Vi OStir - Vi osJr ) 2 , (2) 

where wg, Wd, u>v los , and the indices i and j are exactly the same as in F^, but i r and j r are 
random indices chosen within an angld^of 45° from stars i and j (here, i r and j r are different 
and independent in their distance and velocity terms). 

Now we can search for posit ion- velocity substructure by comparing N t s (< F) for our 
BHB sample to the distribution of 100 Monte Carlo representations of N(< F ). Figure [4] 
shows that N obs (< F) exceeds N(< F ) at high significance for small F, logF < —2 (for 
example, Ad < 1.5 kpc, AVi os < 15 km s -1 and 9 < 8° corresponds to logF < —2). 

Small values of F represent close pairs in position- velocity space. So, Figure [4] demon- 
strates that the observed sample has many more close pairs than the null hypothesis, reflect- 
ing the existence of position- velocity substructure in the BHB sample. For small F one might 



Angular spacing comparable to or larger than the SDSS footprint may be hard to interpret, so we choose 
45° to avoid this. 
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expect N(< F ) oc Fq for the null hypothesis, but the plot shows a somewhat shallower slope, 
presumably arising from the non-random way that stars are sampled by SDSS spectroscopy 
from the celestial sphere. The widely spaced SEGUE- 1/2 spectroscopic plates result in the 
sparse, but locally dense, angular sampling. In addition, we can learn from Figure [4] that 
the CPD statistic focuses on < 0.1% close pairs rather than all pairs of the sample, implying 
that the CPD may be more sensitive to the presence of substructure than the PVD statistic 
(see 



Xue et al. 2009). 



The left panel of Figure [4] shows that there are only 30 pairs with logF < —3, compared 
with total pairs of 8.9 x 10 6 . Therefore, we just analyze the behavior at logF > —3 in the 
subsequent analysis. 

As shown in Figure |5j the substructure signal comes both from the distance and the 
line-of-sight velocity domain. This is apparent if we either scramble only the distances or only 
the velocities between N b<,(< F) and N(< F ). In both cases an excess of small separation 
pairs is present at a comparable level. 



The recent studies of Carollo et al. (2007) and Carollo et al. (2010), based on local 



samples of halo stars, indicate that our Milky Way's stellar halo is complex, and can be 
described by at least two components - denoted as an "inner" and an "outer" halo, with 
different kinematics, distributions of orbital eccentricity, inferred spatial profiles, and peak 
metallicities. In such a decomposition, the inner halo component dominates the region of 
5 kpc < r gc < 10 kpc, while the region of r gc > 20 kpc is dominated by the outer halo. 
Direct in situ evidence for stellar metallicity changes with distance has also been found in 



photometry from the SEGUE vertical stripes (de Jong et al. 2010). 



As dynamical timescales are longer at large distances, we would expect a more clear 
substructure signal in the outer parts of the halo. To test this, we include here the BHB 
stars with 40 kpc < r gc < 60 kpc. We divide the BHB sample into two parts - subsample 
I with 5 kpc < r gc < 20 kpc, and subsample II with 20 kpc < r gc < 60 kpc, both with 
\Z\ > 4 kpc, and compare the substructure signals in the two subsamples. Figure [6] shows 
that both show significant deviations from the null hypothesis. Yet, subsample II shows a 
stronger clustering excess in 4D space than subsample I. This suggests that the substructure 
is stronger (e.g. less phase-mixed) in the outer halo (subsample II) than in the inner halo 
(subsample I). 

As mentioned in the introduction, we are more interested in a general statistical measure 
of substructure for quantitative data-model comparison than in the search for substructure 
associated with a particular feature; N(< F) appears as a useful statistic in this context. 
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4. Position- Velocity Substructure in the BJ05 Models 

Having detected a general substructure signal, we now compare this to expectations for 
N(<F) from cosmological models where the entire stellar halo is made of disrupted satellite 
galaxies. 

BJ05 published models for the formation of the stellar halo of the Milky Way system, 
arising solely from the accretion of ~ 100 — 200 luminous satellite galaxies in the past ~ 12 
Gyr. They used a hybrid semi-analytic plus N-body approach that distinguished explicitly 
between the evolution of baryonic matter and dark matter in accreted satellites. For further 



details of the simulations, we refer the interested reader to BJ05, Robertson et al. (2005 ), Font 



et al. (2006), and references therein. There are 11 simulated halos provided by the Bullock 



& Johnston study, which can be obtained from \http://www. astro . Columbia, edu/^kvj/halos/ 



The simulations produce a realistic stellar halo, with mass and density profiles much like 
that of the Milky Way (e.g. B08), and with surviving satellites matching the observed 
number counts and structural parameter distributions of the satellite galaxies of the Milky 



Way. Sharma et al. (2011) pulished a code to generate a synthetic survey of the Milky Way 
based on BJ05 simulations. Given one or more color-magnitude bounds, a survey size, and 
geometry, this code can return a catalog of stars in accordance with a given model of the 



Milky Way. The Galaxia code will be released publicly at http://galaxia.sourceforge.net. we 



refer the interested reader to Sharma et al. (2011) and references therein 



To start, we assume that BHB stars are representative tracers of the overall popula- 



tion of old, metal-poor, halo stars (see, however, Bell et al. 2010). We then make "mock- 
observations" of the BJ05 simulations, analogous to those presented in Section 2 and ana- 
lyzed in Section 3. In brief, we do this by accounting for the particular survey volume of 
SDSS DR8, the angular separation distribution, and approximate distance distribution of 
the BHB sample, accounting for the luminosity weight of the simulated particles, and by 
adding observational uncertainties for distance and velocity. 

From the simulations, we obtain the particle's 3D positions and 3D velocities in the 
Galactic standard of rest frame, luminosities L, and ages. We transfer these to Galactocentric 
line-of-sight velocities, Vi os , and sky positions (Galactic longitude and latitude, (/,&)), by 
taking the Sun's position as (8.0, 0.0) kpc. The probability of a particle being drawn is 
proportional to the assigned particle luminosity. The SDSS fibers cannot be placed closer 
than about 55 arcsec. However, BHB stars are relatively sparse on the sky ~ 1 per square 
degree, so possible fiber collisions play no role in the clustering analysis. E.g. fewer than 10~ 5 
of all possible pairwise angular separations in the sample fall within < 1'. We also consider 
the spectroscopic sky coverage of SDSS DR8, distance limits (|Z| > 4 kpc, r gc < 40 kpc), and 
the angular separation distribution of the observations. These procedures essentially follow 
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those used by X08. 

Based on the particles with the same sky coverage as SDSS DR8 and the same distance 
limits as the BHB sample, we randomly select a particle within an angle |^]of 1.2° from each 
BHB star % in the sample (where % = 1...4243). This selected particle of luminosity L is 
accepted with a probability of < L/L max , where L max is the maximum luminosity of the 
simulated particles. We also convolve the distances of the mock-observations with an error 
of 5%; the radial velocities are convolved with a Gaussian error of a =5 km s _1 . 

This procedure results in mock-observations of 4243 star particles in the simulations 
that are in the same sky region as SDSS DR8, have a similar angular separation distribution 
to the BHB sample, have the same distance and velocity uncertainties as the BHB sample, 
and have distance limits of \L\ > 4 kpc, r gc < 40 kpc, and satisfy the luminosity weighting 
scheme. The spatial distributions for 11 mock-observations are shown in Figure [7j along 
with the spatial disribution for BHB sample. These mock-observations allow us to consider 
the CPD for the BJ05 simulations. We calculate F for the mock-observations and 100 sets 
of the null hypothesis, F , in each of the 11 simulations. 

The upper panel of Figure [8] shows the close pair distribution for the observed BHB 
sample and the 11 mock-observations. Overall, the observations fall well within the range 
of expectation from the BJ05 simulations, but the simulations have somewhat more sub- 
structure in the realm logF ~ -2.5 to -1. The ratio of the cumulative distribution of the 
mock-observations for all 11 simulated halos of BJ05 are shown in lower panel of Figure [8j 
along with the ratio of the cumulative disribution of the actual data. Inspection of this Fig- 
ure reveals that iV bs(< F) differs significantly from N(< Fq) for all halos, in the sense that 
N bs(< F) > N nu u(< F) at least for < 1% of closest pairs. The strength of the CPD signa- 
ture varies quite strongly among different simulations. Overall, the ensemble of simulations 
show qualitatively the same signature of position-velocity substructure as seen in the real 
data. Moreover, the large majority of simulations exhibit stronger signals than observation 
for —3 < logF < —1 (except 2 BJ05 halos shown as lower panel of Figure [8]). In particular, 
a significant substructure signal can be traced in the mock-observations to logF ~ —1 (e.g., 
Ad < 4.5 kpc, AVi os < 65 km s^ 1 , and 9 < 26° for logF < —1) for most of the simulations, 
while for the observations the substructure signal can only be traced until logF ~ — 2. This 
indicates some quantitative data-model difference. 

The BHB sample qualitatively exhibits the same signature of position-velocity substruc- 
ture as the BJ05 halos, which are by construction 100% substructured. This implies that a 



2 The 1.2° angular distance was found to be the smallest angle that can ensure there is at least one particle 
that can be accepted around star i. 
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large fraction of the Milky Way halo is associated with substructure. However, it is difficult 
to quantify the fraction of BHB stars in the Milky Way that are in substructures, because 
the variation among different all-substructure simulations is so large. At any rate, it has to 
be a substantial fraction. 

Taken at face value, the tendency of the overall stellar halo of cosmologically-motivated 
models to have exhibit more substructure than observed BHB stars, may indicate that the 
models with their rigid halos may somewhat overpredict the degree of substructure; or this 
comparison may imply that some fraction of the Milky Way's halo stars did indeed form in 



an early dissipational component (Hammer et al. 2007 Shen et al. 2010). 



An alternative explanation may be this possibility, and noting that BHB stars do not 
fairly trace the overall halo star populations. To test that BHB stars occur in very old, metal 
poor populations, and we have tested if older star particles from the BJ05 simulation are 
distributed differently from all particles. To explore why the models might be somewhat more 
highly structured, especially for larger F, we compare the F distributions for the observation 
and the particles older than 11 Gyr in the BJ05 halos, where the age refers to the formation 
of the star particle, not the time since disruption of its host satellite. Figure [9] shows that 
the observed substructure signal is comparable with those detected in simulations (Except 
for 2 BJ05 halos). BHB stars are known to represent a very old population, so this may be 
an astrophysically reasonable explanation for the data-model difference. 

Another possibility is that, since the models do not follow mergers self-consistently (i.e., 
the Milky Way's potential grows only smoothly and analytically), the response of the Milky 
Way to infalling objects could serve to disrupt and scatter streams, thereby decreasing the 
importance of substructure. This could be checked in the future with simulations such as 



Cooper et al. (2010b). 



As in the analysis of the BHB sample, we also make mock-observations of the inner- 
and outer-halo regions (here, the mock-observations have similar sky densities to the BHB 



sample), and calculate F and F for the mock-observations. Figure 10 shows that, for most 
BJ05 halos (except halo08), the outer halo (20 kpc < r gc < 60 kpc) exhibits a stronger 
substructure signal than the inner halo (5 kpc < r gc < 20 kpc), consistent with the actual 
observations. 

The mock-observations show that the CPD deviates from the null hypothesis in the 
simulations in a qualitatively similar fashion as the actual observations. At first sight, this 
seems to be a straightforward extension into the kinematic domain of the conclusion reached 
by B08, that the stellar halo exhibits a level of substructure consistent with the stream-only 
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models of BJ05j^] The signal is, however, weaker than that seen in mock-observations drawn 
from the BJ05 simulations. 



Taken together, Figure |3J Figure [7j Figure [HJ Figure [9] and Figure 10 lead to our four 



results: 1) In a sample of > 4000 BHB stars from SDSS DR8 there is a very clear signal for 
position-velocity substructure in the Milky Way's halo stars - close angular pairs of stars have 
smaller velocity and/or distance differences than expected for an uncorrelated distribution. 
2) The outer part of the Milky Way's halo (r gc > 20 kpc) exhibits a statistically stronger 
kinematic substructure signal than the inner halo (r gc < 20 kpc). 3) Mock-observations of 
simulated halos BJ05, made exclusively from disrupted satellites, exhibit a qualitatively very 
similar behavior - N t, s (< F) >N(< F ) for logF < —1. 4) Quantitatively, most simulations 
produce a stronger signal, especially one extending to larger scales (i.e., larger F). However, 
if we identify BHB stars within the simulated halo population with t age > 11 Gyr, the levels 
of substructure are consistent. Given other evidence that BHB stars are most abundant in 
very old populations, this seems perhaps astrophysically more plausible than the alternative 
of postulating a very quiet formation of the Milky Way's halo. 



5. Summary and Conclusions 

In the context of current cosmogonic models, the stellar halos of galaxies like our Milky 
Way are expected to be comprised, to a large degree, of debris from disrupted satellite 
galaxies. After disruption, the dispersing stars will form recognizable streams for some time, 
but may eventually phase-mix beyond easy recognition. There has been recent evidence 
(B08) that the degree of spatial substructure actually seen in the Milky Way's halo matches 
that of simulations (e.g., BJ05), where the stellar halo arises exclusively from disrupted 
satellites. Due to phase-space conservation, the same scenario qualitatively predicts the 
existence of a position-velocity correlation. In this paper, we have pursued a quantitative 
statistical approach to understanding how the Milky Way's stellar halo compares with this 
scenario. 

It has already been established in the published literature that several prominent sub- 
structures exist in the Milky Way's stellar halo, most notably the Sagittarius stream. The 
next step forward is to find simple, robust statistical measures to quantify the level of sub- 
structure in order to allow direct comparison with theoretical models (such as BJ05). There 
is certainly no established procedure, and there may be no unique way to establish such a 



3 In Bell et al. (2008) the BJ05 models are labeled 1-11 in strict numerical order, so that the interested 
reader can compare B08 and this paper side-by-side. 
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statistic. For example, in pure position space, B08 simply took the rms deviation of the 
density from a underlying power-law model. In this paper we have considered a statistic for 
diagnosing position-velocity correlations - the close pair distribution (see Starkenburg et al. 



2009). 



Building on recent initial attempts (Starkenburg et al. 2009 Xue et al. 2009 Harrigan 



et al. 2010; De Propris et al. 2010), this paper presented a more comprehensive attempt 



to quantify the position-velocity substructure of the Milky Way's stellar halo, using BHB 
stars from SDSS, and to compare it to cosmological models. We calculated the close pair 
distribution (CPD) as a function of distance separation, angular separation, and velocity 
separation between pairs of stars. Qualitatively, the signal we were looking for is that 
the observations have significantly more close pairs than an ensemble of null hypotheses, 
where the position and velocity have no correlation. Using this CPD (i.e., the cumulative 
distribution N Q b s (< F), where F is the four-distance in angle, distance, and velocity), we 
found that a sample of over 4000 BHB stars in the halo of the Galaxy exhibit far more close 
pairs than the null hypothesis, which demonstrates the existence of real substructure. This 
result is perhaps not surprising, as some level of substructure is already known to exist (see 



also Starkenburg et al. 2009 De Propris et al. 2010 Harrigan et al. 2010). However, as 



a statistical quantification, it draws on a sample 6-60 times larger than previous analyses 



(Starkenburg et al. 2009 De Propris et al. 2010), and arrives at statistically very clear- 



cut inferences. We also constructed mock-observations of simulated stellar halos that are 
made exclusively of disrupted satellite galaxies. These mock-observations match the angular 
sampling of the SDSS data in detail, and also match the distance cuts applied to the data. 
Comparing, analogously, N Q b s (< F) to N(< F ), we found the qualitatively same signature 
of substructure as in the observed sample. Quantitatively, the observed signal is weaker 
than that seen in the mock-observations, where the stellar halo is entirely made of disrupted 
satellites. Assuming that BHB stars are random tracers of the stars in the simulations, we 
impose a lower age limit of 11 Gyr in producing mock-observations, and found comparable 
levels of position-velocity substructure between observation and simulations. Therefore, 
there are two ways to reconcile the data-model differences: either to infer differences in 
the dynamical formation histories between the simulated and the observed Milky Way, or - 
more plausible in our view - attributing it to the fact that BHB stars are overrepresented 
in the oldest sub-populations of the stellar halo. For both the observations and the mock- 
observations we compared the substructure signals associated with the inner and outer halos, 
and found good agreement between data and model - the outer halo exhibits a stronger 
substructure signal than the inner halo. 



Within the context of SDSS data, the next level of understanding kinematic substructure 
in the Milky Way's outer halo will come from samples of more representative giant stars with 
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good distances. How the results from BHB stars presented here relate to the substructure 
seen in main-sequence samples of the inner halo (Schlaufman et al. 2009| ECHOS) remains 
to be resolved. A more recent generation of simulations (e.g. Cooper et al.| 2010b) will also 
permit more far reaching and robust conclusions. 
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Fig. 1. — BHB star sample selection, based on the Balmer-line shape parameters (see Section 
2). The left-hand panel shows the US line parameters f m and D0.2, divided into three regions 
(following X08 and Sirko et al. 2004): stars with f m > 0.35 are too cool to be BHB stars 
- they are likely main-sequence stars; the concentration of stars with D 2 > 29A is likely 
due to blue stragglers (BS) with higher surface gravity; the region with f m < 0.35 and 
Do. 2 ^ 29A is used as the BHB selection criterion for the H<5, A). 2, and f m method. The 
right-hand panel shows the H7-line profile parameters c 7 and b 7 for the same stars as in 
the left panel. Here, BS and BHB stars can be distinguished clearly through their bimodal 
distribution in this plane. The enclosed region indicates the H 7 scale width-shape criteria 
that selects BHB stars. Our BHB sample is composed of all stars satisfying both criteria 
(left-hand and right-hand panels) simultaneously. This leaves a sample of 4985 objects with 
a high probability of proper classification as BHB stars (see X08). 
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Fig. 2. — Sample properties for the 4625 stars with high probability of being BHB (Figure 
[T|. The first two plots show the Galactic sky coverage of the BHB sample, where stars are 
colored according to radius and line-of-sight velocity. The spatial distribution (x-z plane) is 
shown as the third panel. The coordinate system has its origin at the Galactic center; the 
large filled circle on the x-z plot indicates the location of the Sun (8.0 kpc, kpc); and the 
stars are coded according to line-of-sight velocity. 
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Fig. 3. — The cumulative distribution of BHB stars with distance from the Galactic center, 
r gc , with a median distance of 22 kpc. About 90% of the sample lies between 5 kpc and 40 
kpc. 
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Fig. 4. — (Upper Panel): The close pair distribution, N(< F), for the 4243 BHB stars in the 
SDSS DR8 BHB sample with \Z\ > 4 kpc and r gc < 40 kpc. F is the four-space separation 
between two BHB stars, taking into account in angle, distance, and line-of-sight velocity 
(Eq. 1). The solid line is the cumulative distribution of F as observed; the dashed line 
is the average cumulative distribution of F for 100 null hypotheses, where positions, and 
hence, angular separations for each pair, were retained exactly as in the observations, but 
distances and line-of-sight velocities were scrambled (see Section 3). The filled circles devote 
the mean of 100 such null hypotheses; the thick error bars enclose 68% of the distribution, 
while the thin error bars enclose 95% of the null hypotheses. For small F one might expect 
N(< F ) oc F 2 for the null hypothesis, but the plot shows a somewhat shallower slope, 
presumably arising from the sparse, but locally dense, angular sampling that results from 
the widely spaced SEGUE-1/2 spectroscopic plates. (Lower Panel) Ratio of the cumulative 
distribution, defined as the number of pairs in the BHB sample divided by the average number 
of null hypotheses below a certain F, N G b s (< F)/~N nu u(< F). The thick and thin error bars 
are derived from those on upper plot by propagation of error. Both plots demonstrates that 
there exists a significant excess of close pairs (in distance and velocity) compared to the null 
hypothesis; BHB stars in our sample clearly exhibit position-velocity substructure. 
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Fig. 5. — The ratio of the cumulative distribution for the BHB sample after either scrambling 
only the distances (dashed line, case I), or only the velocities (dotted line, case II), or both of 
them (solid line, case III). In cases I and II an excess of close pairs is observed at a comparable 
level, but is weaker than that of case III. This implies that the substructure signal arises in 
comparable parts from both the distance and the line-of-sight velocity domains. 
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Fig. 6. — The ratio of the cumulative distribution of F for BHB stars in two broad Galac- 
tocentric distance ranges: subsample I, which covers 5 kpc < r gc <20 kpc (dotted line), 
subsample II, which covers 20 kpc < r gc < 60 kpc (solid line). The excess of close pairs 
is observed in both subsamples. The plot illustrates that a position-velocity substructure 
signal is present in both distance ranges, covering the inner and outer stellar halo, and the 
substructure signal is more pronounced at large radii. 
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Fig. 7. — The sky coverage for the BHB sample and 11 mock-observations from BJ05. The 
simulations were sampled in angular coverage and distance distribution to resemble the actual 
BHB sample. The stars are coded according to line-of-sight velocity. This figure shows that 
the velocity distributions between observation and the simulations differ somewhat. There 
are more stars with |Vi os | > 250kms _1 in the simulations. 
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Fig. 8. — The upper panel is the close pair distribution for the observed BHB sample and 
the 11 simulations. The solid line is the cumulative distribution of F for the observed 
BHB sample; the dashed lines are the F distributions for the mock-observations of the 11 
simulations. Overall, the observations fall well within the range of expectation from the 
BJ05 simulations, but the simulations have somewhat more mid-scale power (logF ~ - 
2.5 to -1) than the observations. The lower panel shows the data-model comparison for 
the position-velocity substructure. We show the ratio of the cumulative distribution for 
observations (solid line) and 11 mock-observations from BJ05 (dashed lines). This figure 
shows that all simulations exhibit position- velocity clustering as an excess of N b. s {< F) for 
small F. The observed ratio of the cumulative distribution is smaller than most of those seen 
in the simulations, where the halo is exclusively made up from disrupted satellites. 
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Fig. 9. — The ratio of the cumulative distributions for the BHB sample and particles older 
than 11 Gyr in each of the 11 BJ05 simulations. The solid lines are the ratio of the cumu- 
lative distributions for the observation and the dashed lines are the ratios of the cumulative 
distributions for the 11 mock-observations. Clearly, the observation is comparable to the 
older parts of most simulations (except 2 BJ05 halos). 
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Fig. 10. — The ratio of the cumulative distributions for BHB stars and the 11 BJ05 simula- 
tions in regions that should be dominated by the outer-halo (subsample II: 20 kpc < r gc <60 
kpc; solid lines) and the inner-halo (subsample I: 5 kpc < r gc <20 kpc; dotted lines) pop- 
ulations, respectively. The figure shows that the outer halo exhibits a more pronounced 
substructure signal than the inner halo for most simulations. 
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